Dimensionality Reduction
There are two types of Dimensionality Reduction techniques:
- Feature selection
- Feature extraction
Feature Selection
- Backward Elimination, Forward Selection, Bidirectional Elimination, Score Comparison
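As a rough illustration of backward elimination, the sketch below starts with all features and keeps dropping the one whose removal improves a score the most, stopping when no removal helps. The score used here (adjusted R^2 of an ordinary least-squares fit) is an assumption made for the example; these notes do not say which criterion to use, and p-values are another common choice. The helper names (`solve`, `adjusted_r2`, `backward_elimination`) and the synthetic data are hypothetical.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def adjusted_r2(X, y, cols):
    """Least-squares fit on the chosen columns plus an intercept; return adjusted R^2."""
    rows = [[1.0] + [row[c] for c in cols] for row in X]
    p = len(rows[0])  # number of fitted coefficients, intercept included
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    b = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(A, b)  # normal equations: (X^T X) beta = X^T y
    pred = [sum(bi * ri for bi, ri in zip(beta, r)) for r in rows]
    n = len(y)
    mean_y = sum(y) / n
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - (ss_res / (n - p)) / (ss_tot / (n - 1))

def backward_elimination(X, y):
    """Drop one feature at a time while doing so improves the score."""
    selected = list(range(len(X[0])))
    best = adjusted_r2(X, y, selected)
    while len(selected) > 1:
        score, drop = max(
            (adjusted_r2(X, y, [c for c in selected if c != d]), d) for d in selected
        )
        if score <= best:  # no removal helps any more
            break
        best = score
        selected.remove(drop)
    return selected, best

# Synthetic data (made up for the example): only features 0 and 1 drive y.
random.seed(0)
X = [[random.gauss(0, 1) for _ in range(4)] for _ in range(200)]
y = [3 * r[0] - 2 * r[1] + random.gauss(0, 0.5) for r in X]
selected, score = backward_elimination(X, y)
print(selected, round(score, 3))
```

Forward selection runs the same loop in the opposite direction (start empty, add the feature that improves the score most), and bidirectional elimination alternates the two steps.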
Feature Extraction
- PCA vs. LDA
- PCA captures the directions of greatest variance; LDA finds the directions that best separate the classes
- PCA is unsupervised; LDA is supervised (it uses the dependent variable, i.e. the class labels)
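To make the PCA vs. LDA contrast concrete, here is a minimal 2-D sketch using only the standard library. The data are invented so that most of the variance lies along the x-axis while the two classes differ only along the y-axis. The PCA direction is the principal axis of the pooled scatter matrix, which ignores the labels. The LDA direction is Fisher's discriminant, w proportional to Sw^-1 (m1 - m0), which uses them. The function names and data are assumptions for illustration.

```python
import math
import random

random.seed(0)

def make_class(y_center, n=200):
    # Wide spread in x (std 5), tight spread in y (std 0.3)
    return [(random.gauss(0, 5), random.gauss(y_center, 0.3)) for _ in range(n)]

class0 = make_class(-1.0)
class1 = make_class(1.0)

def mean(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def scatter(points):
    """Entries (sxx, sxy, syy) of the 2x2 scatter matrix around the mean."""
    mx, my = mean(points)
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    syy = sum((y - my) ** 2 for _, y in points)
    return sxx, sxy, syy

def pca_direction(points):
    """Unit vector along the direction of greatest variance (labels unused)."""
    sxx, sxy, syy = scatter(points)
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)  # principal-axis angle of a 2x2 symmetric matrix
    return (math.cos(theta), math.sin(theta))

def lda_direction(a, b):
    """Unit vector proportional to Sw^-1 (mean_b - mean_a), Fisher's discriminant."""
    sa, sb = scatter(a), scatter(b)
    sxx, sxy, syy = (sa[i] + sb[i] for i in range(3))  # within-class scatter Sw
    (ax, ay), (bx, by) = mean(a), mean(b)
    dx, dy = bx - ax, by - ay
    det = sxx * syy - sxy ** 2
    wx = (syy * dx - sxy * dy) / det
    wy = (-sxy * dx + sxx * dy) / det
    norm = math.hypot(wx, wy)
    return (wx / norm, wy / norm)

pca = pca_direction(class0 + class1)
lda = lda_direction(class0, class1)
print("PCA direction:", pca)
print("LDA direction:", lda)
```

On this data the PCA direction points almost along x (where the spread is) and the LDA direction almost along y (where the classes differ), so projecting onto the PCA axis would mix the classes while projecting onto the LDA axis keeps them apart.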
t-SNE
- t-SNE = t-Distributed Stochastic Neighbor Embedding
- t-SNE takes high-dimensional data and reduces it to a low-dimensional map (typically 2-D) that can be plotted
- Unlike PCA (which is linear), t-SNE can capture non-linear relationships (such as the “Swiss Roll” non-linear distribution)
- Instead of trying to maximize variance, it computes a similarity measure from the distances between points and arranges the low-dimensional points so that those similarities are preserved
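The similarity idea can be sketched as follows. In the high-dimensional space, distances are turned into similarities with a Gaussian kernel. In the low-dimensional map, a Student-t kernel is used, which is where the "t" in the name comes from. t-SNE then moves the map points to reduce the KL divergence between the two sets of similarities. The sketch only computes the similarities and the divergence for two hand-made candidate maps; it does not run the optimization. It also uses one fixed bandwidth `sigma`, whereas the real algorithm tunes a bandwidth per point from the perplexity setting. All names and data points are hypothetical.

```python
import math

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def high_dim_similarities(points, sigma=1.0):
    """Symmetrized Gaussian similarities p_ij (simplification: one fixed sigma)."""
    n = len(points)
    cond = []
    for i in range(n):
        weights = [
            0.0 if j == i else math.exp(-sq_dist(points[i], points[j]) / (2 * sigma ** 2))
            for j in range(n)
        ]
        total = sum(weights)
        cond.append([w / total for w in weights])  # conditional p_{j|i}
    return [[(cond[i][j] + cond[j][i]) / (2 * n) for j in range(n)] for i in range(n)]

def low_dim_similarities(points):
    """Student-t (one degree of freedom) similarities q_ij in the map."""
    n = len(points)
    w = [
        [0.0 if i == j else 1.0 / (1.0 + sq_dist(points[i], points[j])) for j in range(n)]
        for i in range(n)
    ]
    total = sum(map(sum, w))
    return [[v / total for v in row] for row in w]

def kl_divergence(P, Q):
    """KL(P || Q) over all pairs; lower means the map preserves neighbors better."""
    return sum(
        p * math.log(p / q)
        for row_p, row_q in zip(P, Q)
        for p, q in zip(row_p, row_q)
        if p > 0
    )

# Two tight clusters in 3-D (made-up data).
high = [(0, 0, 0), (0.1, 0, 0), (0, 0.1, 0), (5, 5, 5), (5.1, 5, 5), (5, 5.1, 5)]
# A 2-D map that keeps each cluster together ...
good_map = [(0, 0), (0.5, 0), (0, 0.5), (10, 10), (10.5, 10), (10, 10.5)]
# ... and one that mixes members of the two clusters.
bad_map = [(0, 0), (10, 10), (0, 0.5), (10.5, 10), (0.5, 0), (10, 10.5)]

P = high_dim_similarities(high)
kl_good = kl_divergence(P, low_dim_similarities(good_map))
kl_bad = kl_divergence(P, low_dim_similarities(bad_map))
print(kl_good, kl_bad)
```

The map that keeps neighbors together gets the lower divergence, which is the quantity the optimizer in t-SNE drives down by gradient descent.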